Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision

Authors

Dushyant Mehta

Helge Rhodin

Dan Casas

Pascal Fua

Oleksandr Sotnychenko

Weipeng Xu

Christian Theobalt

Abstract

We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the limited generalizability of models trained solely on the starkly limited publicly available 3D pose data. We propose novel CNN supervision techniques, using a regularization structure during training that extends the concept of multi-level skip connections, and leverage first- and second-order parent relationships along the skeletal kinematic tree to learn better representations. We introduce a new training set for human body pose estimation from monocular images of real humans, with ground truth captured by a multi-camera marker-less motion capture system. It complements existing corpora with greater diversity in pose, human appearance, clothing, occlusion, and viewpoints, and enables an increased scope of augmentation. We also contribute a new benchmark that covers outdoor and indoor scenes. We further combine it with transfer learning from 2D human pose prediction to achieve even better generalization, and improve over the state-of-the-art on standard benchmarks by more than 25%. We argue that the use of transfer learning of representations in tandem with algorithmic and data contributions is crucial for general progress along many different dimensions of the problem.
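The abstract mentions first- and second-order parent relationships along the kinematic tree but does not say how they enter the training objective. The sketch below shows one plausible reading: alongside the root-relative pose, each joint is also expressed relative to its parent and to its grandparent, giving three sets of regression targets. The five-joint skeleton, the `PARENTS` array, and all function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical toy skeleton (an assumption, not the paper's joint set):
# 0 pelvis (root), 1 spine, 2 neck, 3 head, 4 shoulder.
# PARENTS[j] is the parent of joint j; the root is its own parent.
PARENTS: List[int] = [0, 0, 1, 2, 2]


def sub(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def relative_to_ancestor(
    joints: Sequence[Vec3], parents: Sequence[int], order: int
) -> List[Vec3]:
    """Express each joint relative to its ancestor `order` steps up the tree.

    Walking up from the root stays at the root, so joints near the root
    simply fall back to the closest available ancestor.
    """
    out: List[Vec3] = []
    for j in range(len(joints)):
        ancestor = j
        for _ in range(order):
            ancestor = parents[ancestor]
        out.append(sub(joints[j], joints[ancestor]))
    return out


def supervision_targets(
    joints: Sequence[Vec3], parents: Sequence[int]
) -> Dict[str, List[Vec3]]:
    """Build root-relative, parent-relative and grandparent-relative targets."""
    root = next(j for j, p in enumerate(parents) if p == j)
    return {
        "root_relative": [sub(p, joints[root]) for p in joints],
        "first_order": relative_to_ancestor(joints, parents, 1),
        "second_order": relative_to_ancestor(joints, parents, 2),
    }
```

Under this reading, a network could be trained to regress all three target sets at once, so that errors in one representation are constrained by the others. Whether the paper combines them this way is an assumption here.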

Related resources

MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild

This paper addresses the problem of 3D human pose estimation in the wild. A significant challenge is the lack of training data, i.e., 2D images of humans annotated with 3D poses. Such data is necessary to train state-of-the-art CNN architectures. Here, we propose a solution to generate a large set of photorealistic synthetic images of humans with 3D pose annotations. We introduce an image-based...

Using a single RGB frame for real time 3D hand pose estimation in the wild

We present a method for the real-time estimation of the full 3D pose of one or more human hands using a single commodity RGB camera. Recent work in the area has displayed impressive progress using RGBD input. However, since the introduction of RGBD sensors, there has been little progress for the case of monocular color input. We capitalize on the latest advancements of deep learning, combining ...

Camera Pose Estimation in Unknown Environments using a Sequence of Wide-Baseline Monocular Images

In this paper, a feature-based technique for camera pose estimation in a sequence of wide-baseline images is proposed. Camera pose estimation is an important issue in many computer vision and robotics applications, such as augmented reality and visual SLAM. The proposed method can track captured images taken by a hand-held camera in room-sized workspaces with maximum scene depth of 3-4...

Human Context: Modeling Human-Human Interactions for Monocular 3D Pose Estimation

Automatic recovery of the 3D pose of multiple interacting subjects from an unconstrained monocular image sequence is a challenging and largely unaddressed problem. We observe, however, that by taking the interactions explicitly into account, treating individual subjects as mutual “context” for one another, performance on this challenging problem can be improved. Building on this observation, in this ...

Advancing human pose and gesture recognition

This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses...

Publication date: 2016
